Abstract

Given a directed graph G = (V, E), a natural problem is to choose a minimum
number of the edges in E such that, for any two vertices u and v, if there is a path
from u to v in E, then there is a path from u to v among the chosen edges. We show
that in graphs having no directed cycle with more than three edges, this problem is 
equivalent to Maximum Bipartite Matching. This leads to a small improvement in the 
performance guarantee of the previous best approximation algorithm for the general 
problem. 

1 Introduction 

Let G = (V, E) be a directed graph. The minimum equivalent graph (MEG) problem is the
following: find a smallest subset $S \subseteq E$ of the edges such that, for any two vertices u and
v, if there is a path from u to v in E, then there is a path from u to v using only edges in
S. The problem is NP-hard [?]. A c-approximate solution is a subset of edges providing
the necessary paths of size at most c times the minimum. A c-approximation algorithm is a 
polynomial-time algorithm guaranteeing a c-approximate solution. 

Moyles and Thompson observed that any solution to the MEG problem decomposes 
into solutions for each strongly connected component and a solution for the component graph 
(the graph obtained by contracting each strongly connected component). Thus, the problem 
reduces in linear time to two cases: the graph is either acyclic or strongly connected. If the 
graph is acyclic, the MEG problem is equivalent to the transitive reduction problem, which 
was shown by Aho, Garey and Ullman to be equivalent to transitive closure. Thus, we
assume the graph is strongly connected, so that the problem is to find a small subset of the 
edges preserving the strong connectivity. We refer to this problem as the strongly connected 
spanning subgraph (SCSS) problem. 
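
To make the SCSS objective concrete, the following minimal sketch checks strong connectivity and finds a smallest SCSS by exhaustive search. It is exponential-time and only meant for tiny instances; the function names and the example graph are assumptions made for illustration, not part of the paper.

```python
from itertools import combinations

def reachable(vertices, edges, start):
    """Return the set of vertices reachable from start along directed edges."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_strongly_connected(vertices, edges):
    """True if an arbitrary root reaches, and is reached from, every vertex."""
    vertices = list(vertices)
    root = vertices[0]
    reversed_edges = [(v, u) for u, v in edges]
    return (len(reachable(vertices, edges, root)) == len(vertices)
            and len(reachable(vertices, reversed_edges, root)) == len(vertices))

def brute_force_scss(vertices, edges):
    """Smallest edge subset that keeps the graph strongly connected."""
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if is_strongly_connected(vertices, subset):
                return list(subset)
    return None  # the input graph was not strongly connected

# Hypothetical example: a directed triangle plus two extra chords.
V = [0, 1, 2]
E = [(0, 1), (1, 2), (2, 0), (0, 2), (2, 1)]
best = brute_force_scss(V, E)
print(len(best), best)
```

Subsets are tried in increasing size, so the first strongly connected subset found is a minimum one.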

The only known c-approximation algorithm for any c < 2 works by repeatedly contracting
cycles. Each cycle contracted is either a longest cycle in the current graph, or has length
at least some constant k. The set of contracted edges yields the set S. As k grows, the
performance guarantee of this algorithm rapidly tends to $\pi^2/6 \approx 1.64$.

A natural modification is to solve the problem optimally as soon as the maximum cycle 
length in the current graph drops below some threshold. The problem remains NP-hard 
even when the maximum cycle length is five, but we conjectured in [?] that it was solvable
in polynomial time if the maximum cycle length is three. We use SCSS3 to denote the SCSS 
problem with this restriction. In this paper we confirm the conjecture: 

Theorem 1.1 The SCSS3 problem in n-vertex digraphs reduces in $O(n^2)$ time to Minimum
Bipartite Edge Cover.

This gives an $O(n^2 + m\sqrt{n})$-time algorithm for the SCSS3 problem, since Minimum Bipartite
Edge Cover is trivially equivalent to Maximum Bipartite Matching [?], which can be solved
in $O(m\sqrt{n})$ time [?]. Modifying the cycle-contraction algorithm correspondingly reduces its
performance guarantee by 1/36:

Corollary 1.2 For any $c > \pi^2/6 - 1/36 \approx 1.61$, there exists a c-approximation algorithm
for the MEG problem.

(For graphs with bounded cycle size, a slightly stronger performance guarantee can be shown
as described at the end of Section 4.) This corollary follows from a straightforward modification
to the analysis in [?] of the algorithms described above.

Here is an overview of the reduction of SCSS3 to Edge Cover. We classify each edge as 
either necessary (removal of the edge leaves the graph not strongly connected) or redundant 
(otherwise). It turns out that any SCSS consists of the necessary edges together with a set of 
redundant edges sufficient to ensure that each necessary edge lies on some cycle in the SCSS. 
We characterize the manner in which redundant edges can lie on such cycles — specifically, 
each cycle can have at most one redundant edge and each redundant edge lies on exactly 
one cycle (and thus "provides a cycle" for at most two necessary edges). This allows the 
reduction. 

A natural question is whether SCSS3 is fundamentally simpler than Bipartite Edge Cover. 
In Section 3 we show it is not:

Theorem 1.3 Minimum Bipartite Edge Cover reduces in linear time to SCSS3. 

Comparison to undirected graphs: When the maximum cycle length is three, the 
SCSS problem is as hard as Bipartite Matching. When it is five, the problem is NP-hard.
When it is seventeen, the problem is MAX-SNP-hard [?]. The latter precludes even
a polynomial-time approximation scheme unless P=NP. Thus, digraphs with bounded cycle 
length can have rich connectivity structure. 

This highlights the fundamental difference between connectivity in directed and undi- 
rected graphs. The analogous problem in undirected graphs is to find a minimum-size subset 
of edges preserving 2-edge connectivity. This problem (and many others that are NP-hard in 
general) can be solved optimally in polynomial time for graphs with bounded cycle length [?].

Other related work: Moyles and Thompson gave an exponential-time algorithm for 
the MEG problem; Hsu gave a polynomial-time algorithm for the acyclic case. 

Contents: The body of the paper is organized as follows. Section 2 contains the reduction
of SCSS3 to Edge Cover (proving Theorem 1.1). Section 3 notes that Edge Cover reduces in
linear time to SCSS3, so that (with respect to quadratic time reductions) the problems are
equivalent. Section 4 describes the application: the improved approximation algorithm for
the general MEG problem.



2 Reduction: SCSS3 to Edge Cover 

Let G = (V, E) be a strongly connected digraph with maximum cycle length 3 or less. Assume
that G has at least four vertices, none of which are cut vertices (that is, vertices whose removal
disconnects the underlying undirected graph). This is without loss of generality, because
by standard techniques, in $O(n + m)$ time, the cut vertices can be found and the graph
partitioned into 2-connected components. Clearly a c-approximation for each component 
yields a c-approximation for G. 

Definitions 1 An edge is redundant if deleting the edge from G leaves a strongly connected 
graph. Otherwise it is necessary. 

An edge (u, v) is unsatisfied if there is no path from v to u consisting of necessary edges.

A redundant edge e provides a cycle for an unsatisfied edge (u, v) if there is a path from
v to u consisting of necessary edges and e.
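
The three definitions can be evaluated directly, without any of the faster machinery developed later. The sketch below does so by plain reachability tests; the helper names and the five-vertex example (a root r with edges r -> a_i, a_i -> b_j, b_j -> r, so every cycle has length three) are assumptions chosen for illustration.

```python
def reaches(edges, source, target):
    """True if target is reachable from source using only the given edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return target in seen

def classify(edges):
    """Split the edges of a strongly connected digraph into (necessary, redundant)."""
    necessary, redundant = [], []
    for e in edges:
        u, v = e
        rest = [f for f in edges if f != e]
        # In a strongly connected graph, deleting (u, v) preserves strong
        # connectivity exactly when v is still reachable from u.
        (redundant if reaches(rest, u, v) else necessary).append(e)
    return necessary, redundant

def unsatisfied_edges(edges, necessary):
    """Edges (u, v) with no return path from v to u made of necessary edges."""
    return [(u, v) for u, v in edges if not reaches(necessary, v, u)]

def provides(e, necessary, unsatisfied):
    """Unsatisfied edges (u, v) that gain a return path v -> u once e is added."""
    return [(u, v) for u, v in unsatisfied if reaches(necessary + [e], v, u)]

# Hypothetical instance with maximum cycle length 3 and no cut vertex.
E = [("r", "a1"), ("r", "a2"), ("b1", "r"), ("b2", "r"),
     ("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
necessary, redundant = classify(E)
unsat = unsatisfied_edges(E, necessary)
cycles_provided = {e: provides(e, necessary, unsat) for e in redundant}
for e, served in cycles_provided.items():
    print(e, "provides a cycle for", served)
```

In this instance the four edges touching r are necessary and unsatisfied, and each redundant edge a_i -> b_j provides a cycle for exactly the two necessary edges on its triangle.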

Here is an outline of the reduction. Since the necessary edges are in any SCSS, the
question is which redundant edges to add. It turns out that each redundant edge lies on
exactly one cycle (Lemma 2.1) and thus provides a cycle for at most two unsatisfied edges.
Further, no cycle has more than one redundant edge, so that a set of edges is an SCSS if
and only if it contains the necessary edges and, for each unsatisfied edge e, a redundant edge
providing a cycle for e (Lemma 2.2).

We construct an equivalent instance of Edge Cover: an undirected graph G' that has
a vertex w' for each unsatisfied edge w in G and an edge r' for each redundant edge r in G,
where r' is incident to w' if r provides a cycle for w. It turns out that the graph G' is acyclic
(in the undirected sense) and thus bipartite (Lemma 2.3).

Finally, the redundant edges and the graph G' can be computed in $O(n^2)$ time (Lemmas
2.4 and 2.5).



2.1 Reduction to Bipartite Edge Cover 

Here is the first essential fact: 

Lemma 2.1 Each redundant edge lies on exactly one cycle in G. 

Proof. We use here the assumption that G has at least four vertices, none of which are cut 
vertices. 

Since G is strongly connected, each edge lies on at least one cycle. Suppose for contradiction
that some redundant edge (u, v) lies on more than one cycle. There are (at least)
two distinct paths from v to u. At least one of the paths is of length two. Denote this path
(v, x, u).

Figure 1: No cycle has two redundant edges.

Since edge (u, v) is redundant, there is a path $P_{uv}$ from u to v other than edge (u, v).
$P_{uv}$ must contain x, for otherwise $P_{uv}$ and the path (v, x, u) would form a cycle of more than
three edges.

If the edge (v, u) is present in G, then $P_{uv}$ is of length two (as it forms a cycle with (v, u))
and hence is the path (u, x, v). Thus, in this case, all six possible edges are present between
the three vertices u, x, and v. Let $V_u$ denote the vertices reachable from u without going
through v or x. Define $V_v$ and $V_x$ similarly. Using the strong connectivity of the graph and
its lack of long cycles, one can easily show that these sets are disjoint and have no edges
between them. Thus, either at least one of u, x, or v is a cut vertex or the graph has only
these three vertices. This contradicts our assumption about G.

Thus, the edge (v, u) is not present and there exists a path distinct from (v, x, u) and
(v, u) from v to u. Denote this path, which must be of length two, by (v, y, u). The path $P_{uv}$
must contain y for the same reason $P_{uv}$ contains x. Thus, there is a path Q, without loss of
generality from y to x, that does not contain u or v (see Figure 1). This is a contradiction,
because the edges (x, u), (u, v), and (v, y) would form a cycle of length at least four with the
path Q. □

Lemma 2.2 A set of edges is an SCSS iff it contains the necessary edges and, for each 
unsatisfied edge e, some redundant edge providing a cycle for e. 

Proof. The "if" direction is straightforward. To see the converse, first note that each cycle
in G contains at most one redundant edge (otherwise each redundant edge would lie on more
than one cycle, violating Lemma 2.1). In fact, this also implies that unsatisfied edges are
not redundant, otherwise we would have a cycle with two redundant edges.

Since the SCSS strongly connects the graph, any unsatisfied edge must form a cycle with 
the edges in the SCSS. By the preceding observation, this cycle has one redundant edge. □ 

By Lemma 2.2, the problem reduces to identifying a smallest set of redundant edges such
that for each unsatisfied edge e in G, some redundant edge in the set provides a cycle for e.
By Lemma 2.1, each redundant edge provides a cycle for at most two unsatisfied edges.

Build a graph G' whose vertices correspond to the unsatisfied edges. For each redun- 
dant edge e, if e provides a cycle for two unsatisfied edges, add an edge between the two 
corresponding vertices; if e provides a cycle for one unsatisfied edge, add a self-loop at the 
corresponding vertex. 

By the above discussion, a set of edges in G' forms an edge cover if and only if the 
corresponding set of redundant edges in the original graph, together with the necessary 
edges, form an SCSS. 
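
The construction of G' is mechanical once it is known which unsatisfied edges each redundant edge provides a cycle for. A minimal sketch follows; the input dictionary is hypothetical hand-written data for a five-vertex instance (root r, edges r -> a_i, a_i -> b_j, b_j -> r), and the handling of a redundant edge that serves no unsatisfied edge is an assumption, since the text does not discuss that case.

```python
def build_cover_instance(cycles_provided):
    """Map each redundant edge of G to an edge of G' (a pair of G'-vertices)."""
    g_prime = {}
    for r, served in cycles_provided.items():
        if len(served) == 2:
            g_prime[r] = (served[0], served[1])
        elif len(served) == 1:
            g_prime[r] = (served[0], served[0])  # self-loop
        # Assumption: an edge serving no unsatisfied edge is never needed.
    return g_prime

def is_edge_cover(g_prime, chosen, unsatisfied):
    """True if the chosen redundant edges touch every vertex of G'."""
    covered = set()
    for r in chosen:
        covered.update(g_prime[r])
    return covered == set(unsatisfied)

# Hypothetical data: vertices of G' are the unsatisfied edges of G.
unsatisfied = [("r", "a1"), ("r", "a2"), ("b1", "r"), ("b2", "r")]
cycles_provided = {
    ("a1", "b1"): [("r", "a1"), ("b1", "r")],
    ("a1", "b2"): [("r", "a1"), ("b2", "r")],
    ("a2", "b1"): [("r", "a2"), ("b1", "r")],
    ("a2", "b2"): [("r", "a2"), ("b2", "r")],
}
g_prime = build_cover_instance(cycles_provided)
print(is_edge_cover(g_prime, [("a1", "b1"), ("a2", "b2")], unsatisfied))
print(is_edge_cover(g_prime, [("a1", "b1"), ("a1", "b2")], unsatisfied))
```

The first choice covers all four unsatisfied edges, so together with the necessary edges it forms an SCSS; the second leaves (r, a2) without a cycle.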

So far, we have reduced our problem to Minimum Edge Cover. The next lemma shows 
that the reduction is in fact to Minimum Bipartite Edge Cover. 

Lemma 2.3 G' is bipartite. 

Proof. We will show that the unsatisfied edges in G can be two-colored so that no adjacent
edges have the same color. This gives the result as follows: color each vertex in G' with
the color of its corresponding unsatisfied edge in G; by Lemma 2.1, vertices that share an
edge in G' correspond to adjacent (and therefore differently colored) unsatisfied edges in G.
Thus, no edge in G' has two vertices of the same color.

Assume for contradiction that the unsatisfied edges of G cannot be legally two-colored.
Then some set C of the edges corresponds to an (odd) cycle in the underlying undirected
graph. Let edge (u, v) be one of the edges in C. We will show that there is an alternate path
from u to v, so that (u, v) is redundant. Since unsatisfied edges are not redundant, this is a
contradiction.

It suffices to show that, for each edge (a, b) on C, there is a path from b to a that does
not use (u, v). Suppose the return path for (a, b) does contain (u, v). The return path must
have length two, so either u = b or v = a.

We consider only the first case; the other is similar. Since u = b, this case reduces to
finding a path from u to a that does not use (u, v), given that (u, v, a) is a return path for
(a, u) and that (u, v) and (a, u) are unsatisfied and therefore necessary.

Suppose edge (v, a) was necessary. Then cycle (a, u, v, a) would consist of necessary edges,
so none of its edges would be unsatisfied. Thus, (v, a) is redundant. Let $P_{va}$ be an alternate
path from v to a. $P_{va}$ must go through u, for otherwise $P_{va}$ and the edges (a, u) and (u, v)
would form a cycle of length more than three. Thus, $P_{va}$ contains a path from u to a that
does not go through v. This portion of $P_{va}$ is the desired path. □
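
Once the instance is known to be bipartite, a minimum edge cover can be read off a maximum matching, which is the equivalence mentioned in the introduction. The sketch below assumes the two sides of the bipartition are given with distinct vertex names, and it uses the simple augmenting-path method (often attributed to Kuhn) rather than the faster matching algorithm the paper cites; all names are illustrative.

```python
def max_bipartite_matching(left, edges):
    """Augmenting-path matching; returns a dict right_vertex -> left_vertex."""
    adj = {u: [] for u in left}
    for u, v in edges:
        adj[u].append(v)
    match_of_right = {}

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_of_right or try_augment(match_of_right[v], visited):
                match_of_right[v] = u
                return True
        return False

    for u in left:
        try_augment(u, set())
    return match_of_right

def min_bipartite_edge_cover(left, right, edges):
    """Extend a maximum matching: each unmatched vertex takes any incident edge."""
    match_of_right = max_bipartite_matching(left, edges)
    cover = {(u, v) for v, u in match_of_right.items()}
    covered = {u for u, _ in cover} | {v for _, v in cover}
    for u, v in edges:
        if u not in covered or v not in covered:
            cover.add((u, v))
            covered.update((u, v))
    if covered != set(left) | set(right):
        raise ValueError("an isolated vertex makes an edge cover impossible")
    return cover

# Hypothetical example: three left vertices, two right vertices.
left, right = ["a1", "a2", "a3"], ["b1", "b2"]
edges = [("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a3", "b2")]
cover = min_bipartite_edge_cover(left, right, edges)
print(len(cover), sorted(cover))
```

Because the matching is maximum, no edge joins two unmatched vertices, so each extra edge covers exactly one new vertex and the cover has (number of vertices) minus (matching size) edges.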

2.2 Complexity 

To finish the proof of Theorem 1.1, we show that the reduction can be computed in $O(n^2)$
time.

Lemma 2.4 Classifying the edges as redundant or necessary requires $O(n^2)$ time.

Proof. Let G have n vertices and m edges. Fix a root r and find an incoming and an outgoing
branching (spanning trees rooted at r with all edges directed towards or, respectively, away
from r). This can be done in $O(n + m)$ time using depth-first search. Let B be the union of
the sets of edges in the two branchings. There are at most $2n - 2$ edges in B and the edges
not in B are redundant. This leaves $O(n)$ edges to be classified. Classify them using $O(n)$
time per edge as follows.

Consider an edge (u, v). Enumerate the other vertices to check for alternate paths from
u to v of length two. If such a path exists, the edge is redundant. Otherwise, check for the
edge (v, u). If it exists, the edge (u, v) is necessary, because any alternate path from u to
v would have to have length two and all such paths have been checked. Otherwise, check
all return paths of length two. If at least two of these paths exist, the edge is necessary by
Lemma 2.1.

Otherwise (u, v) has a unique return path (v, w, u). If an alternate path from u to v
exists, then it must use w (else we get a cycle of length at least four). Because of the edge
(w, u), the path from u to w can have length at most two. Similarly, the path from w to v
can have length at most two. Thus, w and the existence of the paths from u to w and w to
v can be determined by enumeration in $O(n)$ time. □
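
The first step of this proof, discarding every edge outside the two branchings, can be sketched as follows. The code uses a stack-based graph search in place of a strict depth-first search (any search tree rooted at r serves as a branching); the function names and the example instance are assumptions for illustration.

```python
def branching(vertices, edges, root):
    """Tree edges of a graph search from root (spanning if all are reached)."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
    seen, stack, tree = {root}, [root], set()
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                tree.add((u, v))
                stack.append(v)
    return tree

def candidate_edges(vertices, edges, root):
    """Return (B, outside): B is the union of an out- and an in-branching."""
    out_tree = branching(vertices, edges, root)
    reversed_edges = [(v, u) for u, v in edges]
    # A search tree in the reversed graph is an in-branching of the original.
    in_tree = {(u, v) for v, u in branching(vertices, reversed_edges, root)}
    b = out_tree | in_tree
    outside = [e for e in edges if e not in b]  # all redundant
    return b, outside

# Hypothetical instance: root r, edges r -> a_i, a_i -> b_j, b_j -> r.
V = ["r", "a1", "a2", "b1", "b2"]
E = [("r", "a1"), ("r", "a2"), ("b1", "r"), ("b2", "r"),
     ("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
B, outside = candidate_edges(V, E, "r")
print(len(B), "edges left to classify;", outside, "already known redundant")
```

B alone is strongly connected, which is why every edge outside it is redundant and only the at most 2n - 2 edges of B need the per-edge test.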

Lemma 2.5 Building the graph G' requires $O(n^2)$ time.

Proof. Once the edges have been classified as redundant or necessary, the unsatisfied edges
and the return paths for the redundant edges can be identified in $O(n^2)$ time as follows.
Each edge (u, v) is redundant, or necessary but not unsatisfied, if and only if it has a return
path of one or two necessary edges. Enumerate each path of one or two necessary edges; let
u and v be the first and last vertices on the path; if there is an edge (v, u), then either note
its return path (if it is redundant) or note that it is not unsatisfied (if it is necessary). There
are $O(n^2)$ such paths, since there are $O(n)$ necessary edges. □



This proves Theorem 1.1.



3 Reduction: Edge Cover to SCSS3 

The proof of Theorem 1.3 (Minimum Bipartite Edge Cover reduces in linear time to SCSS3)
is somewhat simpler:

Proof of Theorem 1.3. Given an undirected bipartite graph, construct a directed graph as
shown in Figure 2. Direct all the edges from the first part to the second part. Add a root
vertex with edges to each vertex in the first part and from each vertex in the second part.
Every cycle in the resulting graph passes through the root and has exactly three edges.
Any edge cover in the original graph (together with the added edges) yields an SCSS and
vice versa. □
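
The construction can be checked on a small case by exhaustive search. The sketch below builds the digraph and compares the two optima; the brute-force helpers are exponential-time, and the vertex names, the default root label, and the example graph are assumptions for illustration (the root label must differ from every vertex name).

```python
from itertools import combinations

def edge_cover_to_scss3(left, right, edges, root="root"):
    """Build the digraph: root -> left, left -> right, right -> root."""
    vertices = [root] + list(left) + list(right)
    arcs = [(root, a) for a in left] + list(edges) + [(b, root) for b in right]
    return vertices, arcs

def strongly_connected(vertices, arcs):
    def reach(pairs):
        adj = {}
        for u, v in pairs:
            adj.setdefault(u, []).append(v)
        seen, stack = {vertices[0]}, [vertices[0]]
        while stack:
            for v in adj.get(stack.pop(), []):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen
    everything = set(vertices)
    return (reach(arcs) == everything
            and reach([(v, u) for u, v in arcs]) == everything)

def smallest(items, ok):
    """Size of a smallest subset of items satisfying ok (brute force)."""
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            if ok(subset):
                return size
    return None

# Hypothetical bipartite graph.
left, right = ["a1", "a2", "a3"], ["b1", "b2"]
edges = [("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a3", "b2")]
vertices, arcs = edge_cover_to_scss3(left, right, edges)

def covers(subset):
    return {x for e in subset for x in e} == set(left) | set(right)

cover_size = smallest(edges, covers)
scss_size = smallest(arcs, lambda s: strongly_connected(vertices, s))
print(cover_size, scss_size)
```

The added root edges are all necessary (each is the only edge into a left vertex or out of a right vertex), so the minimum SCSS exceeds the minimum edge cover by exactly the number of vertices of the bipartite graph.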



4 Application to the General MEG Problem 

Here we describe the improvement to the approximation algorithm for the general MEG
problem in [?]. As usual, without loss of generality, assume the graph is strongly connected.
The algorithm in [?] works by repeatedly contracting cycles. Each cycle contracted is either
a longest cycle in the current graph, or has length at least some constant k. The set of
contracted edges yields the set S. As k grows, the performance guarantee of the algorithm
tends rapidly to $\pi^2/6 \approx 1.64$.

Figure 2: Bipartite Edge Cover reduces to SCSS3.

Assume $k \ge 4$. Modify the algorithm so that as soon as the current graph has maximum
cycle size three or less, it solves the problem optimally (using Theorem 1.1) and returns the
edges in the solution for the current graph together with the edges on previously contracted
cycles.

To contract an edge is to identify its endpoints in the graph as a single vertex; to contract
a cycle is to identify all vertices on the cycle. We use the following result from [?]:

Proposition 4.1 ([?]) If the maximum cycle length in an n-vertex graph is $\ell$, then any
SCSS has at least $\frac{\ell}{\ell-1}(n-1)$ edges.

The proof is that any strongly connected graph can be contracted to a single vertex by
repeatedly contracting cycles whose edges are in the SCSS; the ratio of edges contracted to
vertices lost when one of these cycles is contracted is at least $\ell/(\ell-1)$.
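
As a quick illustration of the proposition, the bound can be evaluated exactly with rational arithmetic. This is a minimal sketch assuming the bound has the form l/(l-1) * (n-1), where l is the maximum cycle length; the function name is hypothetical, and rounding up is valid only because an edge count is an integer.

```python
from fractions import Fraction
from math import ceil

def scss_lower_bound(n, max_cycle_length):
    """Smallest integer that is at least l/(l-1) * (n-1)."""
    l = max_cycle_length
    return ceil(Fraction(l, l - 1) * (n - 1))

# With maximum cycle length 3, a 5-vertex graph needs at least 6 edges,
# and with maximum cycle length 2, a 4-vertex graph needs at least 6.
print(scss_lower_bound(5, 3), scss_lower_bound(4, 2))
```

The bound weakens toward n - 1 as the permitted cycle length grows, matching the intuition that a single long cycle is the cheapest way to connect many vertices.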

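Proof of Corollary 1.2. Initially, let the graph have n vertices. Assume $n_i$ vertices remain
in the contracted graph after contracting cycles with i or more edges ($i = k, k-1, \ldots, 4$).
Finally, we get a graph H (with $n_4$ vertices) that has no cycles of length four or more; the
algorithm solves the SCSS problem for H optimally.

How many edges are returned? Let OPT(G) denote the minimum size of an SCSS of
G. In contracting cycles with at least k edges, at most $\frac{k}{k-1}(n - n_k)$ edges are contributed
to the solution. For $4 \le i < k$, in contracting cycles with i edges, $\frac{i}{i-1}(n_{i+1} - n_i)$ edges are
contributed. The number of edges returned is thus at most

\[ \frac{k}{k-1}(n - n_k) + \sum_{i=4}^{k-1} \frac{i}{i-1}(n_{i+1} - n_i) + \mathrm{OPT}(H). \]

A little work shows this is equal to

\[ \frac{k}{k-1}(n - 1) + \sum_{i=5}^{k} \frac{n_i - 1}{(i-1)(i-2)} - \frac{4}{3}(n_4 - 1) + \mathrm{OPT}(H). \]

Since $\mathrm{OPT}(H) \le 2(n_4 - 1)$, substituting for $n_4$ gives the upper bound

\[ \frac{k}{k-1}(n - 1) + \sum_{i=5}^{k} \frac{n_i - 1}{(i-1)(i-2)} + \frac{1}{3}\,\mathrm{OPT}(H). \]

Clearly $\mathrm{OPT}(G) \ge n - 1$. For $4 \le i \le k$, when $n_i$ vertices remain, no cycle has more than
$i - 1$ edges. By Proposition 4.1, any SCSS of the current graph (and therefore any SCSS of
G) has at least $\frac{i-1}{i-2}(n_i - 1)$ edges. Also $\mathrm{OPT}(G) \ge \mathrm{OPT}(H)$. Using these three
facts, the above quantity, divided by OPT(G), is at most

\[ \frac{k}{k-1} + \sum_{i=5}^{k} \frac{1}{(i-1)(i-1)} + \frac{1}{3} \;=\; \frac{1}{k-1} + \sum_{i=1}^{k-1} \frac{1}{i^2} - \frac{1}{36}. \]

Using the identity (from [?, p. 75]) $\sum_{i=1}^{\infty} \frac{1}{i^2} = \frac{\pi^2}{6}$, this is equal to

\[
\begin{aligned}
\frac{\pi^2}{6} - \frac{1}{36} + \frac{1}{k-1} - \sum_{i=k}^{\infty} \frac{1}{i^2}
  &< \frac{\pi^2}{6} - \frac{1}{36} + \frac{1}{k-1} - \sum_{i=k}^{\infty} \frac{1}{i(i+1)} \\
  &= \frac{\pi^2}{6} - \frac{1}{36} + \frac{1}{k-1} - \frac{1}{k} \\
  &= \frac{\pi^2}{6} - \frac{1}{36} + \frac{1}{k(k-1)}.
\end{aligned}
\]

□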
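
The arithmetic at the end of this proof is easy to check numerically. The sketch below assumes the ratio bound has the form k/(k-1) + sum over i = 5..k of 1/(i-1)^2 + 1/3 and the final bound the form pi^2/6 - 1/36 + 1/(k(k-1)); the function names are illustrative.

```python
from fractions import Fraction
from math import pi

def ratio_bound(k):
    """k/(k-1) + sum_{i=5}^{k} 1/(i-1)^2 + 1/3, as an exact fraction."""
    return (Fraction(k, k - 1)
            + sum(Fraction(1, (i - 1) ** 2) for i in range(5, k + 1))
            + Fraction(1, 3))

def closed_form(k):
    """pi^2/6 - 1/36 + 1/(k(k-1)), the final upper bound, as a float."""
    return pi ** 2 / 6 - 1 / 36 + 1 / (k * (k - 1))

for k in (4, 5, 10, 100):
    print(k, float(ratio_bound(k)), closed_form(k))
```

For every k the first column stays below the second, and both approach pi^2/6 - 1/36 as k grows.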

Similarly to [?], standard techniques can yield more accurate estimates, e.g.,
$\pi^2/6 - 1/36 + 1/(2k^2) + O(1/k^3)$. Also following [?], if the graph initially has no cycle
longer than $\ell$ ($\ell \ge k$), then
the analysis can be generalized to show a performance guarantee of
$\frac{k^{-1} - \ell^{-1}}{1 - k^{-1}} + \sum_{i=1}^{k-1} \frac{1}{i^2} - \frac{1}{36}$.
For instance, in a graph with no cycle longer than 5, the analysis bounds the performance
guarantee (when k = 5) by 1.396.
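
The quoted value 1.396 can be reproduced with exact arithmetic. This sketch assumes the generalized guarantee has the form (1/k - 1/l)/(1 - 1/k) + sum over i = 1..k-1 of 1/i^2 - 1/36, where l is the longest cycle length in the input; the function name is hypothetical.

```python
from fractions import Fraction

def bounded_cycle_guarantee(k, l):
    """(1/k - 1/l)/(1 - 1/k) + sum_{i=1}^{k-1} 1/i^2 - 1/36, for l >= k >= 4."""
    head = (Fraction(1, k) - Fraction(1, l)) / (1 - Fraction(1, k))
    tail = sum(Fraction(1, i * i) for i in range(1, k))
    return head + tail - Fraction(1, 36)

value = bounded_cycle_guarantee(5, 5)
print(value, float(value))
```

With k = l = 5 the first term vanishes and the result is the exact fraction 67/48, which rounds to 1.396.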